UCB algorithm
41bfd20a38bb1b0bec75acf0845530a7-Reviews.html
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance.

Summary: The paper considers the following general sequential learning problem. A set of actions \mathcal{A} and a parametric set F of bounded functions mapping actions to the reals are given. At each time step t, a subset \mathcal{A}_t \subset \mathcal{A} is provided; the learner must choose one action A_t \in \mathcal{A}_t and then receives a reward R_t. The reward is assumed to be drawn from a sub-Gaussian distribution with mean f_\theta(A_t), where f_\theta \in F is a fixed, unknown function. In Section 4, the authors then introduce the notion of eluder dimension, which is designed to capture how difficult it is to infer the value of one action from previously observed ones.
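To make the interaction protocol summarized above concrete, here is a minimal Python sketch of one run of the problem. It is only an illustration under stated assumptions and is not taken from the paper: the function class is a hypothetical one-parameter family f_\theta(a) = \theta a on actions in [0, 1], the noise is Gaussian (one particular sub-Gaussian choice), the available subsets \mathcal{A}_t are drawn uniformly at random, and the names `run_protocol`, `uniform_policy` and `oracle_policy` are invented for this example. The policies are placeholders, not the algorithm under review.

```python
import random


def run_protocol(mean_reward, actions, horizon, subset_size, policy,
                 noise_sd=1.0, seed=0):
    """Simulate the sequential problem and return (history, regret).

    mean_reward plays the role of the unknown f_theta; the policy never
    sees it directly, only past (action, reward) pairs.
    """
    rng = random.Random(seed)
    history = []   # list of (A_t, R_t) pairs observed so far
    regret = 0.0   # cumulative gap to the best available action
    for _ in range(horizon):
        # A_t subset: assumed here to be a uniformly random subset of actions.
        available = rng.sample(actions, subset_size)
        action = policy(available, history, rng)
        # Reward with mean f_theta(A_t) plus Gaussian (hence sub-Gaussian) noise.
        reward = mean_reward(action) + rng.gauss(0.0, noise_sd)
        history.append((action, reward))
        best = max(mean_reward(a) for a in available)
        regret += best - mean_reward(action)
    return history, regret


def uniform_policy(available, history, rng):
    """Placeholder learner: ignores the history and picks at random."""
    return rng.choice(available)


# Hypothetical function class: f_theta(a) = theta * a, bounded on [0, 1].
THETA = 0.7


def f_theta(a):
    return THETA * a


def oracle_policy(available, history, rng):
    """Benchmark that knows f_theta; it incurs zero regret by construction."""
    return max(available, key=f_theta)


ACTIONS = [i / 10 for i in range(11)]

history, regret = run_protocol(f_theta, ACTIONS, horizon=50,
                               subset_size=4, policy=uniform_policy)
_, oracle_regret = run_protocol(f_theta, ACTIONS, horizon=50,
                                subset_size=4, policy=oracle_policy)
```

Regret is measured here against the best action in each \mathcal{A}_t, so the oracle benchmark accumulates none, while any learner that has to infer \theta from noisy rewards generally will.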